Ubuntu 10.04 & IBM DS3524 with FC multipath, inactive path is [failed][faulty] instead of [active][ghost]

Posted by Graeme Donaldson on Ask Ubuntu
Published on 2012-06-01T13:45:46Z

OK, this is my setup:

FC Switches IBM/Brocade, Switch1 and Switch2, independent fabrics.

Server IBM x3650 M2, 2x QLogic QLE2460, one connected to each FC switch.

Storage IBM DS3524, 2x controllers with 4x FC ports each, but only 2 ports connected on each controller.

+-----------------------------------------------------------------------+
|               HBA1             Server              HBA2               |
+-----------------------------------------------------------------------+
                 |                                     |
                 |                                     |
                 |                                     |
+-----------------------------+          +------------------------------+
|            Switch1          |          |            Switch2           |
+-----------------------------+          +------------------------------+
         |                |                  |                 |
         |                |                  |                 |
         |                |                  |                 |
         |                |                  |                 |
         |                |                  |                 |
+-----------------------------------+-----------------------------------+        
| Contr A, port 3 | Contr A, port 4 | Contr B, port 3 | Contr B, port 4 |
+-----------------------------------+-----------------------------------+
|                                 Storage                               |
+-----------------------------------------------------------------------+

My /etc/multipath.conf is taken from the IBM Redbook for the DS3500, except for the prio_callout setting. IBM uses /sbin/mpath_prio_tpc, but according to http://changelogs.ubuntu.com/changelogs/pool/main/m/multipath-tools/multipath-tools_0.4.8-7ubuntu2/changelog, this was renamed to /sbin/mpath_prio_rdac, which is what I'm using.

devices {
  device {
    #ds3500
    vendor "IBM"
    product "1746   FAStT"
    hardware_handler "1 rdac"
    path_checker rdac
    failback 0
    path_grouping_policy multibus
    prio_callout "/sbin/mpath_prio_rdac /dev/%n"
  }
}

multipaths {
  multipath {
    wwid xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    alias array07
    path_grouping_policy multibus
    path_checker readsector0
    path_selector "round-robin 0"
    failback "5"
    rr_weight priorities
    no_path_retry "5"
  }
}
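
The device and multipath sections above set some of the same keywords to different values. As a minimal sketch for spotting such overlaps, the Python below parses a copy of the config and lists keywords that appear in both sections with different values. The helper names (parse_block, conflicts) are hypothetical, the parser only handles the flat blocks shown here, and it makes no claim about which section multipathd gives precedence to or whether each keyword is accepted in each section.

```python
import re

# Copy of the configuration from the post (illustration only).
CONF = '''
devices {
  device {
    #ds3500
    vendor "IBM"
    product "1746   FAStT"
    hardware_handler "1 rdac"
    path_checker rdac
    failback 0
    path_grouping_policy multibus
    prio_callout "/sbin/mpath_prio_rdac /dev/%n"
  }
}
multipaths {
  multipath {
    wwid xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    alias array07
    path_grouping_policy multibus
    path_checker readsector0
    path_selector "round-robin 0"
    failback "5"
    rr_weight priorities
    no_path_retry "5"
  }
}
'''


def parse_block(text, name):
    """Return {keyword: value} for the first flat `name { ... }` block."""
    match = re.search(r'\b' + name + r'\s*\{([^{}]*)\}', text)
    settings = {}
    if match is None:
        return settings
    for line in match.group(1).splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        key, _, value = line.partition(' ')
        settings[key] = value.strip().strip('"')
    return settings


def conflicts(device, multipath):
    """Keywords set in both sections whose values differ."""
    return {key: (device[key], multipath[key])
            for key in sorted(device.keys() & multipath.keys())
            if device[key] != multipath[key]}


device = parse_block(CONF, 'device')
multipath = parse_block(CONF, 'multipath')
for key, (dev_value, mp_value) in conflicts(device, multipath).items():
    print(f'{key}: device={dev_value!r} multipath={mp_value!r}')
```

For this config the overlapping keywords with different values are path_checker and failback, which may be worth checking against the multipath.conf documentation for the installed multipath-tools version.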

The output of multipath -ll with controller A as the preferred path:

root@db06:~# multipath -ll
sdg: checker msg is "directio checker reports path is down"
sdh: checker msg is "directio checker reports path is down"
array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM     ,1746      FASt
[size=4.9T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 5:0:1:0 sdd 8:48  [active][ready]
 \_ 5:0:2:0 sde 8:64  [active][ready]
 \_ 6:0:1:0 sdg 8:96  [failed][faulty]
 \_ 6:0:2:0 sdh 8:112 [failed][faulty]

If I change the preferred path using IBM DS Storage Manager to Controller B, the output swaps accordingly:

root@db06:~# multipath -ll
sdd: checker msg is "directio checker reports path is down"
sde: checker msg is "directio checker reports path is down"
array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM     ,1746      FASt
[size=4.9T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 5:0:1:0 sdd 8:48  [failed][faulty]
 \_ 5:0:2:0 sde 8:64  [failed][faulty]
 \_ 6:0:1:0 sdg 8:96  [active][ready]
 \_ 6:0:2:0 sdh 8:112 [active][ready]

According to IBM, the inactive path should be "[active][ghost]", not "[failed][faulty]".
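
To make the comparison between the two states easier to repeat, here is a minimal Python sketch that extracts the per-path states from multipath -ll output in the legacy format shown above and lists any path that is neither [active][ready] nor [active][ghost]. The function names are hypothetical, and the regular expression is written only against the sample output in this post, so other multipath-tools versions may need a different pattern.

```python
import re

# Sample taken from the first `multipath -ll` output above.
SAMPLE = r'''array07 (xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) dm-2 IBM     ,1746      FASt
[size=4.9T][features=1 queue_if_no_path][hwhandler=0]
\_ round-robin 0 [prio=2][active]
 \_ 5:0:1:0 sdd 8:48  [active][ready]
 \_ 5:0:2:0 sde 8:64  [active][ready]
 \_ 6:0:1:0 sdg 8:96  [failed][faulty]
 \_ 6:0:2:0 sdh 8:112 [failed][faulty]
'''

# Matches path lines such as: \_ 6:0:1:0 sdg 8:96  [failed][faulty]
PATH_LINE = re.compile(
    r'\\_\s+(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\S+)\s+(?P<majmin>\d+:\d+)\s+'
    r'\[(?P<dm_state>\w+)\]\[(?P<checker_state>\w+)\]'
)


def path_states(output):
    """Map each device name to its (dm_state, checker_state) pair."""
    return {m.group('dev'): (m.group('dm_state'), m.group('checker_state'))
            for m in PATH_LINE.finditer(output)}


def unexpected_paths(output, ok=(('active', 'ready'), ('active', 'ghost'))):
    """Devices whose state is not one of the accepted pairs."""
    return sorted(dev for dev, state in path_states(output).items()
                  if state not in ok)


print(path_states(SAMPLE))
print(unexpected_paths(SAMPLE))
```

Path-group lines such as "\_ round-robin 0 [prio=2][active]" are skipped because they have no host:channel:id:lun field after the marker.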

Despite this, I don't seem to have any I/O issues, but my syslog is being spammed with this every 5 seconds:

Jun  1 15:30:09 db06 multipathd: sdg: directio checker reports path is down
Jun  1 15:30:09 db06 kernel: [ 2350.282065] sd 6:0:2:0: [sdh] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun  1 15:30:09 db06 kernel: [ 2350.282071] sd 6:0:2:0: [sdh] Sense Key : Illegal Request [current] 
Jun  1 15:30:09 db06 kernel: [ 2350.282076] sd 6:0:2:0: [sdh] <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1
Jun  1 15:30:09 db06 kernel: [ 2350.282083] sd 6:0:2:0: [sdh] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Jun  1 15:30:09 db06 kernel: [ 2350.282092] end_request: I/O error, dev sdh, sector 0
Jun  1 15:30:10 db06 multipathd: sdh: directio checker reports path is down
Jun  1 15:30:14 db06 kernel: [ 2355.312270] sd 6:0:1:0: [sdg] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jun  1 15:30:14 db06 kernel: [ 2355.312277] sd 6:0:1:0: [sdg] Sense Key : Illegal Request [current] 
Jun  1 15:30:14 db06 kernel: [ 2355.312282] sd 6:0:1:0: [sdg] <<vendor>> ASC=0x94 ASCQ=0x1ASC=0x94 ASCQ=0x1
Jun  1 15:30:14 db06 kernel: [ 2355.312290] sd 6:0:1:0: [sdg] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Jun  1 15:30:14 db06 kernel: [ 2355.312299] end_request: I/O error, dev sdg, sector 0
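
The interval can be read from the kernel timestamps in the excerpt above. A minimal Python sketch, using only the two end_request lines from that excerpt and hypothetical helper names, pulls out the timestamp, device, and sector of each I/O error and computes the gap between consecutive errors:

```python
import re

# The two end_request lines from the syslog excerpt above.
LOG = '''Jun  1 15:30:09 db06 kernel: [ 2350.282092] end_request: I/O error, dev sdh, sector 0
Jun  1 15:30:14 db06 kernel: [ 2355.312299] end_request: I/O error, dev sdg, sector 0
'''

IO_ERROR = re.compile(
    r'kernel: \[\s*(?P<ts>\d+\.\d+)\] end_request: I/O error, '
    r'dev (?P<dev>\w+), sector (?P<sector>\d+)'
)


def io_errors(log):
    """Return (kernel_timestamp, device, sector) for each I/O error line."""
    return [(float(m.group('ts')), m.group('dev'), int(m.group('sector')))
            for m in IO_ERROR.finditer(log)]


def gaps(events):
    """Seconds between consecutive events, rounded to milliseconds."""
    times = [ts for ts, _, _ in events]
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


events = io_errors(LOG)
print(events)
print(gaps(events))
```

Both errors are reads of sector 0 on the paths reported as down, roughly five seconds apart, which would be consistent with a periodic path check rather than application I/O. That reading is an interpretation of the excerpt, not something the logs state directly.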

Does anyone know how I can get the inactive path to show "[active][ghost]" instead of "[failed][faulty]"? I assume that once I can get that right then the spam in my syslog will end as well.

One final thing worth mentioning is that the IBM Redbook targets SLES 11, so I'm assuming there's something a little different under Ubuntu that I just haven't figured out yet.

Update: As suggested by Mitch, I've tried removing /etc/multipath.conf, and now the output of multipath -ll looks like this:

root@db06:~# multipath -ll
sdg: checker msg is "directio checker reports path is down"
sdh: checker msg is "directio checker reports path is down"
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxdm-1 IBM     ,1746      FASt
[size=4.9T][features=0][hwhandler=0]
\_ round-robin 0 [prio=1][active]
 \_ 5:0:2:0 sde 8:64  [active][ready]
\_ round-robin 0 [prio=1][enabled]
 \_ 5:0:1:0 sdd 8:48  [active][ready]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:1:0 sdg 8:96  [failed][faulty]
\_ round-robin 0 [prio=0][enabled]
 \_ 6:0:2:0 sdh 8:112 [failed][faulty]

So it's more or less the same, with the same messages in the syslog every 5 seconds as before, but the grouping has changed: each path is now in its own path group.
